
    Adversarial Domain Adaptation for Duplicate Question Detection

    We address the problem of detecting duplicate questions in forums, which is an important step towards automating the process of answering new questions. As finding and annotating such potential duplicates manually is very tedious and costly, automatic methods based on machine learning are a viable alternative. However, many forums do not have annotated data, i.e., questions labeled by experts as duplicates, and thus a promising solution is to use domain adaptation from another forum that has such annotations. Here we focus on adversarial domain adaptation, deriving important findings about when it performs well and what properties of the domains are important in this regard. Our experiments with StackExchange data show an average improvement of 5.6% over the best baseline across multiple pairs of domains. (EMNLP 2018 short paper, camera-ready; 8 pages.)
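
    As a concrete illustration of the adversarial setup, here is a minimal sketch of domain-adversarial training with a gradient reversal layer, the standard mechanism in this family of methods. The class names, feature dimensions, and loss wiring are illustrative assumptions, not the authors' released code.

```python
# A minimal sketch of adversarial domain adaptation via a gradient
# reversal layer; names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negated gradient on the backward
    pass, pushing the encoder toward domain-invariant features."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class AdversarialDuplicateDetector(nn.Module):
    def __init__(self, in_dim=768, hid=256, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU())
        self.task_head = nn.Linear(hid, 2)    # duplicate vs. not (source labels)
        self.domain_head = nn.Linear(hid, 2)  # source vs. target forum

    def forward(self, pair_repr):
        h = self.encoder(pair_repr)
        # The domain head sees reversed gradients, so minimizing its loss
        # trains the encoder to *confuse* the domain classifier.
        return self.task_head(h), self.domain_head(GradReverse.apply(h, self.lambd))

# Usage: sum the task loss (labeled source pairs only) and the domain
# loss (unlabeled pairs from both forums) and backpropagate once.
task_logits, domain_logits = AdversarialDuplicateDetector()(torch.randn(4, 768))
```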

    Automatic Fact-guided Sentence Modification

    Online encyclopediae like Wikipedia contain large amounts of text that need frequent corrections and updates, and the new information may contradict existing content. In this paper, we focus on rewriting such dynamically changing articles. This is a challenging constrained generation task, as the output must be consistent with the new information and fit into the rest of the existing document. To this end, we propose a two-step solution: (1) we identify and remove the contradicting components in a target text for a given claim, using a neutralizing stance model; (2) we expand the remaining text to be consistent with the given claim, using a novel two-encoder sequence-to-sequence model with copy attention. Applied to a Wikipedia fact-update dataset, our method successfully generates updated sentences for new claims, achieving the highest SARI score. Furthermore, we demonstrate that generating synthetic data through such rewritten sentences can successfully augment the FEVER fact-checking training dataset, leading to a relative error reduction of 13%. (AAAI 2020.)
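
    To make the two-encoder idea concrete, below is a minimal sketch in which one encoder reads the neutralized text, a second reads the claim, and the decoder attends over both at each step. Copy attention is omitted, and all module names and sizes are assumptions rather than the paper's implementation.

```python
# A minimal two-encoder sequence-to-sequence sketch; copy attention
# is omitted and all sizes/names are illustrative assumptions.
import torch
import torch.nn as nn

class TwoEncoderSeq2Seq(nn.Module):
    def __init__(self, vocab=10000, emb=128, hid=256):
        super().__init__()
        self.hid = hid
        self.embed = nn.Embedding(vocab, emb)
        self.text_enc = nn.GRU(emb, hid, batch_first=True)   # residual text
        self.claim_enc = nn.GRU(emb, hid, batch_first=True)  # new claim
        self.decoder = nn.GRUCell(emb + hid, hid)
        self.attn = nn.Linear(hid, hid, bias=False)
        self.out = nn.Linear(hid, vocab)

    def attend(self, query, memory):
        # Dot-product attention over one encoder's states.
        scores = torch.einsum("bh,bth->bt", self.attn(query), memory)
        return torch.einsum("bt,bth->bh", scores.softmax(-1), memory)

    def forward(self, text_ids, claim_ids, target_ids):
        text_mem, _ = self.text_enc(self.embed(text_ids))
        claim_mem, _ = self.claim_enc(self.embed(claim_ids))
        h = text_mem.new_zeros(text_ids.size(0), self.hid)
        logits = []
        for t in range(target_ids.size(1)):
            # The context mixes both sources, so the output can stay
            # fluent with the old text while agreeing with the claim.
            ctx = self.attend(h, text_mem) + self.attend(h, claim_mem)
            step_in = torch.cat([self.embed(target_ids[:, t]), ctx], dim=-1)
            h = self.decoder(step_in, h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)  # (batch, tgt_len, vocab)
```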

    The Limitations of Stylometry for Detecting Machine-Generated Fake News

    Recent developments in neural language models (LMs) have raised concerns about their potential misuse for automatically spreading misinformation. In light of these concerns, several studies have proposed to detect machine-generated fake news by capturing their stylistic differences from human-written text. These approaches, broadly termed stylometry, have found success in source attribution and misinformation detection in human-written texts. However, in this work, we show that stylometry is limited against machine-generated misinformation. While humans speak differently when trying to deceive, LMs generate stylistically consistent text regardless of the underlying motive. Thus, though stylometry can successfully prevent impersonation by identifying text provenance, it fails to distinguish legitimate LM applications from those that introduce false information. We create two benchmarks demonstrating the stylistic similarity between malicious and legitimate uses of LMs, employed in auto-completion and editing-assistance settings. Our findings highlight the need for non-stylometric approaches to detecting machine-generated misinformation, and open up the discussion on desirable evaluation benchmarks. (Accepted to the Computational Linguistics journal as a squib; previously posted under the title "Are We Safe Yet? The Limitations of Distributional Features for Fake News Detection".)
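
    For readers unfamiliar with stylometry, the following toy pipeline shows the kind of surface-feature detector the paper stress-tests: character n-gram features feeding a linear classifier. The data and labels are fabricated placeholders for the demo only, not the paper's benchmarks.

```python
# An illustrative stylometry-style detector: character n-grams plus a
# linear classifier. Texts and labels below are toy placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["the senator said on tuesday that the vote would proceed",
         "scientists reported a novel result in the latest issue"]
labels = [0, 1]  # 0 = human-written, 1 = machine-generated (toy labels)

detector = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # stylistic cues
    LogisticRegression(max_iter=1000),
)
detector.fit(texts, labels)

# The paper's point: such a detector can attribute provenance, but a
# stylistically consistent LM looks the same whether its output is
# truthful or false, so veracity is beyond these features.
print(detector.predict(["officials confirmed the new figures today"]))
```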

    Towards Debiasing Fact Verification Models

    Fact verification requires validating a claim in the context of evidence. We show, however, that in the popular FEVER dataset this might not necessarily be the case: claim-only classifiers perform competitively with top evidence-aware models. In this paper, we investigate the cause of this phenomenon, identifying strong cues for predicting labels solely based on the claim, without considering any evidence. We create an evaluation set that avoids those idiosyncrasies. The performance of FEVER-trained models drops significantly when evaluated on this test set. Therefore, we introduce a regularization method which alleviates the effect of bias in the training data, obtaining improvements on the newly created test set. This work is a step towards a more sound evaluation of reasoning capabilities in fact verification models. (EMNLP-IJCNLP 2019.)
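
    One common way to instantiate such bias-aware training is to down-weight examples that a frozen claim-only model already classifies correctly, as sketched below. This is a hedged approximation of the idea; the paper's actual regularization scheme differs in its details.

```python
# A hedged sketch of bias-aware reweighting: examples a claim-only
# model gets right count less, so the evidence-aware model cannot
# lean on claim-only cues. An approximation, not the paper's method.
import torch
import torch.nn.functional as F

def debiased_loss(main_logits, claim_only_logits, labels):
    """Cross-entropy weighted by how poorly the claim-only model does.

    main_logits:       (B, C) from the evidence-aware model
    claim_only_logits: (B, C) from a frozen claim-only classifier
    labels:            (B,)   gold labels
    """
    with torch.no_grad():
        # Probability the claim-only model assigns to the gold label.
        p_bias = claim_only_logits.softmax(-1).gather(1, labels[:, None]).squeeze(1)
        weights = 1.0 - p_bias  # confidently biased examples are down-weighted
    per_example = F.cross_entropy(main_logits, labels, reduction="none")
    return (weights * per_example).mean()

loss = debiased_loss(torch.randn(8, 3), torch.randn(8, 3), torch.randint(0, 3, (8,)))
```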

    Multi-source domain adaptation with mixture of experts

    Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from the PDF version of the thesis; includes bibliographical references (pages 35-37). We propose a mixture-of-experts approach for unsupervised domain adaptation from multiple sources. The key idea is to explicitly capture the relationship between a target example and different source domains. This relationship, expressed by a point-to-set metric, determines how to combine predictors trained on various domains. The metric is learned in an unsupervised fashion using meta-training. Experimental results on sentiment analysis and part-of-speech tagging demonstrate that our approach consistently outperforms multiple baselines and can robustly handle negative transfer. By Darsh J. Shah.
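
    A minimal sketch of the mixture-of-experts idea follows: one predictor per source domain, combined by gates derived from a point-to-set metric between the target example and each source domain's examples. The mean-set summary and learned projection here are illustrative assumptions, not the thesis's exact formulation.

```python
# A minimal mixture-of-experts sketch with a point-to-set gating
# metric; the mean-set summary and projection are assumptions.
import torch
import torch.nn as nn

class MoEDomainAdapter(nn.Module):
    def __init__(self, n_domains, in_dim=300, n_classes=2):
        super().__init__()
        # One expert (here a linear classifier) per source domain.
        self.experts = nn.ModuleList(
            [nn.Linear(in_dim, n_classes) for _ in range(n_domains)]
        )
        # Projection defining the metric; learned via meta-training.
        self.metric = nn.Linear(in_dim, in_dim, bias=False)

    def forward(self, x, domain_sets):
        """x: (B, D) target examples; domain_sets: list of (N_i, D) source sets."""
        dists = []
        for s in domain_sets:
            # Point-to-set distance: from x to the domain's mean,
            # measured in the learned projection space.
            center = self.metric(s.mean(0, keepdim=True))          # (1, D)
            dists.append(((self.metric(x) - center) ** 2).sum(-1)) # (B,)
        gate = torch.stack(dists, dim=-1).neg().softmax(-1)        # closer -> higher weight
        preds = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, C, K)
        return (preds * gate[:, None, :]).sum(-1)                  # weighted combination

# Usage with two toy source domains:
model = MoEDomainAdapter(n_domains=2)
logits = model(torch.randn(4, 300), [torch.randn(50, 300), torch.randn(80, 300)])
```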

    Contrastive Text Generation

    This thesis focuses on developing summaries that present multiple viewpoints on issues of interest. Such capacity is important in many areas, like medical studies, where articles may not agree with each other. While the automatic summarization methods developed in the recent decade excel in single-document and multi-document scenarios with high content overlap amongst inputs, there is an increasing need to automate comparative summarization, as evidenced by the number of services for such reviews in the domains of law and medicine. Building on a traditional generation pipeline of planning and realization, I propose models for three scenarios with contradictions, where the planners identify pertinent pieces of information and consensus to adequately realize relations between them. First, I tackle contradictions between an old piece of text and a claim for the task of factual updates. As there is no supervision available to solve this task, our planner utilizes a fact-checking dataset to identify disagreeing phrases in an old text with respect to the claim. Subsequently, we use agreeing pairs from the fact-checking dataset to learn a text-fusion realizer. Our approach outperforms several baselines on automatically updating text and on a fact-checking augmentation task, demonstrating the importance of a planner-realizer pipeline that can deal with a pair of contrastive inputs. Second, I describe an approach for multi-document summarization, where input articles have varying degrees of consensus. In a scenario with very few parallel data points, we utilize a planner to identify key content and consensus amongst inputs, and leverage large amounts of free data to train a fluent realizer. Compared to state-of-the-art baselines, our method produces more relevant and consensus-cognisant summaries. Third, I describe an approach for comparative summarization, where a new research idea is compared and contrasted against related past works. Our planner predicts citation reasons for each input article with respect to the current research to generate a tree of related papers. Utilizing an iterative realizer to produce citation-reason-aware text spans for every branch, our model outperforms several state-of-the-art summarization models in generating related work for scholarly papers. (Ph.D. thesis.)

    Capturing Greater Context for Question Generation

    Automatic question generation can benefit many applications, ranging from dialogue systems to reading comprehension. While questions are often asked with respect to long documents, modeling such long documents poses many challenges. Many existing techniques generate questions by effectively looking at one sentence at a time, leading to questions that are easy and not reflective of the human process of question generation. Our goal is to incorporate interactions across multiple sentences to generate realistic questions for long documents. In order to link a broad document context to the target answer, we represent the relevant context via a multi-stage attention mechanism, which forms the foundation of a sequence-to-sequence model. We outperform state-of-the-art methods on question generation on three question-answering datasets: SQuAD, MS MARCO, and NewsQA. (DSO Grant DSOCL18002.)
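
    The sketch below illustrates one plausible reading of a multi-stage attention mechanism: a first stage scores sentences against the target answer, and a second stage attends over words rescaled by sentence relevance. The stage structure and dimensions are assumptions, not the paper's exact architecture.

```python
# A hedged sketch of multi-stage attention linking broad document
# context to a target answer; stage details are assumptions.
import torch
import torch.nn as nn

class MultiStageAttention(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.sent_scorer = nn.Bilinear(dim, dim, 1)       # stage 1: answer vs. sentence
        self.word_attn = nn.Linear(dim, dim, bias=False)  # stage 2: word level

    def forward(self, word_states, sent_states, answer):
        """word_states: (S, T, D) words per sentence; sent_states: (S, D);
        answer: (D,) target-answer representation."""
        S = sent_states.size(0)
        # Stage 1: which sentences matter for this answer?
        sent_w = self.sent_scorer(answer.expand(S, -1),
                                  sent_states).squeeze(-1).softmax(0)    # (S,)
        # Stage 2: attend over words, rescaled by sentence relevance.
        word_scores = torch.einsum("d,std->st", self.word_attn(answer),
                                   word_states)                          # (S, T)
        word_w = word_scores.softmax(-1) * sent_w[:, None]
        return torch.einsum("st,std->d", word_w, word_states)  # context vector

# Usage: 6 sentences of 20 words each, feeding a seq2seq decoder.
ctx = MultiStageAttention()(torch.randn(6, 20, 256), torch.randn(6, 256),
                            torch.randn(256))
```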

    Nutri-bullets: Summarizing Health Studies by Composing Segments

    We introduce Nutri-bullets, a multi-document summarization task for health and nutrition. First, we present two datasets of food and health summaries from multiple scientific studies. Furthermore, we propose a novel extract-compose model to solve the problem in the regime of limited parallel data. We explicitly select key spans from several abstracts using a policy network, followed by composing the selected spans into a summary via a task-specific language model. Compared to state-of-the-art methods, our approach leads to more faithful, relevant and diverse summarization -- properties imperative to this application. For instance, on the BreastCancer dataset our approach achieves a more than 50% improvement in relevance and faithfulness.
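
    To illustrate the extract step, here is a minimal REINFORCE-style policy network that scores and samples candidate spans, with the reward standing in for a score of the composed summary. The span encoder, reward, and update rule are stand-in assumptions, not the paper's training setup.

```python
# A minimal REINFORCE sketch for span selection; the span encodings
# and reward are placeholder assumptions for illustration only.
import torch
import torch.nn as nn

class SpanPolicy(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, span_reprs):                  # (N, D) candidate spans
        return self.scorer(span_reprs).squeeze(-1)  # (N,) selection logits

policy = SpanPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

span_reprs = torch.randn(20, 256)   # stand-in encodings of candidate spans
probs = policy(span_reprs).sigmoid()
picks = torch.bernoulli(probs)      # sample a subset of spans to extract

# A real reward would score the summary the language model composes
# from the picked spans (e.g., faithfulness and relevance); here it
# is a placeholder scalar.
reward = torch.tensor(0.7)

# REINFORCE: raise the log-probability of selections in proportion
# to the reward they earned.
log_prob = (picks * probs.log() + (1 - picks) * (1 - probs).log()).sum()
loss = -reward * log_prob
optimizer.zero_grad(); loss.backward(); optimizer.step()
```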